Coarse-grained quantum state estimation for noisy measurements 
We introduce a straightforward numerical coarse-graining scheme to estimate quantum states 
for a set of noisy measurement outcomes, which are difficult to calibrate, that is based solely on 
the measurement data collected from these outcomes. This scheme involves the maximization of a 
weighted entropy function that is simple to implement and can readily be extended to any number 
of ill-calibrated noisy outcomes in a measurement set-up, thus offering practical applicability for 
general tomography experiments without additional knowledge or assumptions about the structures 
of the noisy outcomes. Simulation results for two-qubit quantum states show that coarse-graining 
can improve the tomographic efficiencies for noise levels ranging from low to moderately high values. 

Quantum state tomography is one of the standard protocols for determining the integrity of the
quantum state $\rho_{\mathrm{true}}$ of a source. Typically, a probability operator measurement
(POM) $\{\Pi_j\}$, with outcomes $\Pi_j \geq 0$, is designed to measure a collection of quantum
systems that are produced by the source. The measurement data collected are then the numbers of
occurrences $n_j$ of all the outcomes $\Pi_j$. With these data, an estimator $\hat{\rho}$ [1] for
the source can be inferred and used as a book-keeping device to predict probabilities for future
measurements or expectation values of any observable. Hypothetically, if the number of copies
$N = \sum_j n_j$ measured approaches infinity, the estimator $\hat{\rho}$ would essentially be the
quantum state $\rho_{\mathrm{true}}$ since the measured frequencies $f_j = n_j/N$ tend to the true
probabilities $p_j$. In a realistic experimental scenario, however, the number $N$ is always finite
and the resulting frequencies will not in general be physical probabilities. As such, we require
more sophisticated methods of state estimation to ensure that the resulting estimator is positive.
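
To illustrate why finite data call for such methods, the following is a minimal sketch in Python (NumPy assumed). The single-qubit Pauli measurement and the counts are hypothetical and are not taken from the simulations in this Letter. Equating the measured frequencies directly with probabilities (linear inversion) gives here an operator with a negative eigenvalue, which is not a valid quantum state.

```python
import numpy as np

# Hypothetical counts for a qubit measured along x, y and z (100 copies per axis).
# Each pair is (number of "+1" detections, number of "-1" detections).
counts = {"x": (56, 44), "y": (50, 50), "z": (100, 0)}

# Pauli operators.
sigma = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

# Linear inversion: use the measured frequencies as if they were true probabilities.
bloch = {axis: (up - down) / (up + down) for axis, (up, down) in counts.items()}
rho_lin = 0.5 * (np.eye(2, dtype=complex) + sum(bloch[a] * sigma[a] for a in sigma))

# Eigenvalues in ascending order; a negative entry signals a non-physical estimator.
eigenvalues = np.linalg.eigvalsh(rho_lin)
print("Bloch vector length:", np.sqrt(sum(r**2 for r in bloch.values())))
print("eigenvalues of the linear-inversion estimator:", eigenvalues)
```

The Bloch vector obtained from these counts is slightly longer than one, so the smaller eigenvalue falls below zero even though the trace is still one.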

Usually for a given tomography experiment, the POM 
that is used to perform the measurement is not exactly 
the intended POM of interest, owing to external random 
noise or systematic errors that perturb the measurement 
outcomes. One would need to calibrate these measure- 
ment outcomes before they can be used to reconstruct 
the unknown quantum state. Such calibrations are car- 
ried out by carefully performing separate experiments us- 
ing well-defined probes to analyze the characteristics of 
the measurement [2], which may be accompanied by distribution
modeling for the external noise [3]. Ideally, if 
precise calibrations can be done for all the measurement 
outcomes, there exist quantum state estimation schemes 
available to reconstruct the state [4-6]. 

In this Letter, we discuss state estimation for the case in which some, if not all, of the POM
outcomes are difficult to calibrate. This can happen, for instance, if the noise perturbation
evolves in such a way that there is no known distribution to describe such an evolution, or when
hardware constraints simply render the task of calibration almost impossible. Under some
assumptions about the noise and stability of the measurement outcomes, there exist
self-calibrating techniques that simultaneously estimate the quantum state and certain aspects
of the outcomes [7]. Using only the data $\{n_j\}$, our aim is to develop a straightforward
numerical scheme to perform reliable quantum state estimation without any knowledge or
assumptions about the noise distribution and the fine details of the noisy outcomes.

Throughout the discussion, we shall denote the well-calibrated POM outcomes as $\Pi_j^{(w)}$ and
the ill-calibrated POM outcomes as $\Pi_k^{(i)}$. For the purpose of this discussion, we shall use
the popular maximum-likelihood (ML) technique [4] for state estimation. Suppose that out of a
total of $M$ POM outcomes measured, only $M_1$ of them are well calibrated and $M_2 = M - M_1$
outcomes are ill-calibrated and unknown. The likelihood functional for such a set of measurement
outcomes is given by

$$\mathcal{L}(\mathcal{D};\rho) = \left[\prod_{j=1}^{M_1}\left(\frac{p_j^{(w)}}{\eta}\right)^{n_j^{(w)}}\right]\left[\prod_{k=1}^{M_2}\left(\frac{p_k^{(i)}}{\eta}\right)^{n_k^{(i)}}\right], \tag{1}$$

where $\mathcal{D} = \{n_j^{(w)}; n_k^{(i)}\}$ is the set of measurement data,
$p_j^{(w)} = \mathrm{tr}\{\rho\,\Pi_j^{(w)}\}$, $p_k^{(i)} = \mathrm{tr}\{\rho\,\Pi_k^{(i)}\}$ and
$\eta = \sum_j p_j^{(w)} + \sum_k p_k^{(i)}$. Let us also suppose that to every ill-calibrated
outcome $\Pi_k^{(i)}$, there is a corresponding outcome $\Pi_k$ that one had intended to design and
that the noisy outcome $\Pi_k^{(i)}$ is a result of noise perturbation on $\Pi_k$. Other than
this, no other assumptions are made regarding the actual measurement outcomes. Without loss of
generality, we shall take the outcomes $\Pi_j^{(w)}$ and $\Pi_k$, with
$\sum_j \Pi_j^{(w)} + \sum_k \Pi_k \leq 1$, to be informationally complete, that is, these
outcomes define a unique ML estimator. In principle, the subsequent arguments can be applied to
informationally incomplete measurements.
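
As a minimal sketch of how such a likelihood functional can be evaluated numerically, the Python snippet below computes its logarithm, assuming the product form normalized by $\eta$ (the sum of all outcome probabilities). The function name `log_likelihood`, the two-outcome qubit measurement and the counts are hypothetical choices made for illustration only.

```python
import numpy as np

def log_likelihood(rho, outcomes, counts):
    """Return sum_l n_l * log(p_l / eta), with p_l = tr(rho Pi_l) and eta = sum_l p_l."""
    probs = np.array([np.trace(rho @ outcome).real for outcome in outcomes])
    eta = probs.sum()
    counts = np.asarray(counts, dtype=float)
    observed = counts > 0  # outcomes that were never detected contribute a factor of one
    return float(np.sum(counts[observed] * np.log(probs[observed] / eta)))

# Hypothetical qubit measurement whose outcomes sum to 0.8 times the identity (eta < 1).
outcomes = [0.4 * np.diag([1.0, 0.0]), 0.4 * np.diag([0.0, 1.0])]
counts = [70, 30]

rho_matched = np.diag([0.7, 0.3])  # reproduces the measured frequencies exactly
rho_mixed = np.eye(2) / 2          # maximally mixed state

ll_matched = log_likelihood(rho_matched, outcomes, counts)
ll_mixed = log_likelihood(rho_mixed, outcomes, counts)
print(ll_matched, ll_mixed)
```

The state that reproduces the measured frequencies has the larger log-likelihood, as expected for an ML estimator.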

Ideally, if we know precisely the identities of all the outcomes, the actual ML estimator
$\hat{\rho}_{\mathrm{ML}}$ can be reconstructed. Ruling out the calibration of the
$\Pi_k^{(i)}$'s as a viable option, one can conceive at least two strategies to go about
estimating the state of the source in this situation. The first strategy is to simply take the
data, neglect any noise in the system and reconstruct the ML estimator
$\hat{\rho}_{\mathrm{ML}}^{\mathrm{raw}}$ by taking $\Pi_k^{(i)} = \Pi_k$ (Strategy 1), hoping
that $\hat{\rho}_{\mathrm{ML}}^{\mathrm{raw}}$ is close to $\hat{\rho}_{\mathrm{ML}}$. If the
actual $\Pi_k^{(i)}$'s deviate from the respective $\Pi_k$'s significantly, this strategy will
not give an accurate estimator in general. The second strategy would be, since we are completely
ignorant about the $\Pi_k^{(i)}$'s, to discard all data obtained from measuring these
ill-calibrated outcomes and use the rest to reconstruct the state (Strategy 2). Depending on the
outcomes $\Pi_j^{(w)}$, the data may not be informationally complete. If so, there will in general
be a convex set of estimators that give the same estimated probabilities. There are many ways of
choosing a specific estimator from this set, none of which can systematically single out one that
is close to $\hat{\rho}_{\mathrm{ML}}$ without additional information. Furthermore, if none of
the outcomes is well calibrated, this strategy cannot be used.

We thus need to resort to another strategy and, before we begin our modest attempt, we
acknowledge that reasonable estimates for the $\Pi_k^{(i)}$'s over the admissible space of POM
outcomes are required to estimate $\hat{\rho}_{\mathrm{ML}}$ accurately. This is difficult to do
given the complexity of this space and that the only available information is the set
$\mathcal{D}$. Instead, we take a different route and first carry out the replacement

$$\prod_{k=1}^{M_2}\left(\frac{p_k^{(i)}}{\eta}\right)^{n_k^{(i)}} \;\rightarrow\; \prod_{k=1}^{M_2}\left(\frac{p_k}{\eta}\right)^{n'_k}, \tag{2}$$

with $\eta = \sum_j p_j^{(w)} + \sum_k p_k$ and $p_k = \mathrm{tr}\{\rho\,\Pi_k\}$, for the term in
Eq. (1) that is contributed by the ill-calibrated $\Pi_k^{(i)}$'s, so that maximizing the
resulting new likelihood functional will still give the same ML estimator
$\hat{\rho}_{\mathrm{ML}}$ as maximizing the original one in Eq. (1). In this way, we forgo the
problem of estimating the $\Pi_k^{(i)}$'s and turn to a fundamentally different problem of
estimating the positive $n'_k$'s, with the latter glossing over the operator details of the
measurement and focusing on a reduced set of parameters. In this sense, this procedure results in
a coarse-grained parameter estimation. With only limited data, it is impossible to precisely
obtain the correct $n'_k$'s for which the replacement in Eq. (2) leads to exactly
$\hat{\rho}_{\mathrm{ML}}$.

We will, next, estimate the $n'_k$'s using the data $\{n_k^{(i)}\}$ and restrict the estimation
using the relation

$$N = \sum_{j=1}^{M_1} n_j^{(w)} + \sum_{k=1}^{M_2} n_k^{(i)} = \sum_{j=1}^{M_1} n_j^{(w)} + \sum_{k=1}^{M_2} n'_k. \tag{3}$$

This technique of coarse graining, which shall be our third strategy, attempts to perform an
estimation on the set $\{n'_k\}$ with the raw data $\{n_k^{(i)}\}$ for the ill-calibrated
outcomes and use the result, together with $\{n_j^{(w)}\}$, to reconstruct an estimator
$\hat{\rho}_{\mathrm{ML}}^{\mathrm{cg}}$ (Strategy 3) using the intended outcomes $\Pi_k$. This
strategy is meaningful only if the noise perturbation is not too large.

We can now think of the raw data $n_k^{(i)}$ as a summary of our present subjective knowledge
about the source and ill-calibrated outcomes. To estimate the parameters $n'_k$ for noise
perturbation that is not too large, we will utilize a simple statistical inference method that
takes this subjective knowledge into account. For small $N$, it can happen that $n_k^{(i)} = 0$
for some $k$. In this case, we simply set the corresponding $n'_k = 0$ since any nonzero values
require justification the data cannot provide. Estimation shall then be performed on the
$n'_k$'s for which the $n_k^{(i)}$'s are strictly positive. By normalizing the two sets of
parameters $\{n_k^{(i)}\}$ and $\{n'_k\}$ with $N_{\mathrm{ill}} = \sum_k n_k^{(i)}$, we define
the weighted entropy function

$$\mathcal{H}_{\{\nu_k^{(i)}\}}(\{\nu'_k\}) = -\sum_k \nu_k^{(i)}\,\nu'_k \log \nu'_k \tag{4}$$

to hold all information about the source and measurement, which is characterized by the
parameters $\nu'_k = n'_k/N_{\mathrm{ill}}$ to be estimated, with each outcome detection weighted
by the respective subjective knowledge that is gained from the $M_{>0} \leq M_2$ positive
$\nu_k^{(i)} = n_k^{(i)}/N_{\mathrm{ill}}$.
The relation between the concept of information and the weighted entropy function in Eq. (4) was
previously introduced and studied in Ref. [8]. The function
$\mathcal{H}_{\{\nu_k^{(i)}\}}(\{\nu'_k\})$ can also be regarded as a measure of the uncertainty
of $\{\nu'_k\}$ given $\{\nu_k^{(i)}\}$. One approach of estimating the $n'_k$'s, which we shall
henceforth consider here, is to maximize the weighted entropy function with the constraint that
$\sum_k \nu'_k = 1$. The principle of such a weighted entropy maximization has already been used
in the fields of Economics, Genetics and data pattern recognition [9]. In effect, by maximizing
the uncertainty with respect to our subjective knowledge about the ill-calibrated outcomes, we
are searching for the parameters $n'_k$ that form the least-biased set with respect to the data
$\{n_k^{(i)}\}$. Based on this criterion, we deem this set as a conservative guess of the actual
data that one would obtain if the intended outcomes $\Pi_k$ are used for measurement. As the
weighted entropy function $\mathcal{H}_{\{\nu_k^{(i)}\}}(\{\nu'_k\})$ is a concave function in
$\nu'_k$, it always gives a unique solution for its maximum and there is, hence, no ambiguity in
the parameter estimation. Moreover, there is an arsenal of efficient nonlinear optimization
methods for such simple convex optimization problems. There are two other features in this
coarse-grained maximum weighted entropy (MWE) estimation scheme. Firstly, this scheme can be
applied to any number of ill-calibrated outcomes $M_2$. This is particularly useful if one
believes that the entire set of measurement outcomes is noisy and not well calibrated, in which
case one may choose $M_2 = M$ to perform the coarse-grained MWE estimation. Secondly, this
coarse-grained scheme does not rely on the details of the noisy outcomes. Such versatility
permits its application to very general experimental situations without assuming any additional
structures whatsoever about the noise and the ill-calibrated outcomes. From the MWE frequencies
$\nu_k^{\mathrm{MWE}}$ that maximize the weighted entropy function in Eq. (4), we obtain the set
$\mathcal{D}_{\mathrm{MWE}} = \{n_j^{(w)}; n_k^{\mathrm{MWE}}\}$, with
$n_k^{\mathrm{MWE}} = N_{\mathrm{ill}}\,\nu_k^{\mathrm{MWE}}$, and use it to reconstruct an ML
estimator by maximizing the new likelihood functional

$$\mathcal{L}(\mathcal{D}_{\mathrm{MWE}};\rho) = \left[\prod_{j=1}^{M_1}\left(\frac{p_j^{(w)}}{\eta}\right)^{n_j^{(w)}}\right]\left[\prod_{k=1}^{M_2}\left(\frac{p_k}{\eta}\right)^{n_k^{\mathrm{MWE}}}\right], \tag{5}$$

with $p_k = \mathrm{tr}\{\rho\,\Pi_k\}$ and $\eta = \sum_j p_j^{(w)} + \sum_k p_k$.

FIG. 1. Plots of the average trace-class distance over 20 experiments against the concurrence for
250 randomly generated pure states at noise levels $\mu = 0.1$, 0.3 and 0.5. For moderately high
values of $\mu$, the estimators $\hat{\rho}_{\mathrm{ML}}^{\mathrm{cg}}$ (green ■) are almost
always closer to $\hat{\rho}_{\mathrm{ML}}$ than the estimators
$\hat{\rho}_{\mathrm{ML}}^{\mathrm{raw}}$ (red ▲).
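
The maximum-weighted-entropy step described above can be sketched in Python as follows. This is an illustration under stated assumptions and not necessarily the optimizer used for the results in this Letter: the weighted entropy is assumed to have the form $-\sum_k \nu_k^{(i)} \nu'_k \log \nu'_k$, and the function name `mwe_counts` is hypothetical. Under that assumption, the stationarity condition with a Lagrange multiplier $\lambda$ gives $\nu'_k = \exp(-1 - \lambda/\nu_k^{(i)})$, and $\lambda$ is fixed by the normalization $\sum_k \nu'_k = 1$, which the sketch solves by bisection.

```python
import numpy as np

def mwe_counts(raw_counts, iterations=200):
    """Coarse-grained counts n'_k from raw counts n_k of the ill-calibrated outcomes.

    Assumes the weighted entropy -sum_k w_k * v_k * log(v_k) with weights w_k = n_k / N_ill.
    Outcomes with zero raw counts are assigned zero coarse-grained counts.
    """
    n = np.asarray(raw_counts, dtype=float)
    n_ill = n.sum()
    positive = n > 0
    w = n[positive] / n_ill  # weights, strictly positive

    def excess(lam):
        # Normalization defect of the stationary solution v_k = exp(-1 - lam / w_k);
        # it decreases monotonically with lam.
        return np.exp(-1.0 - lam / w).sum() - 1.0

    lo = -w.min()                            # here excess(lo) >= 0
    hi = w.max() * max(np.log(w.size), 1.0)  # here excess(hi) < 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid

    v = np.exp(-1.0 - 0.5 * (lo + hi) / w)
    v /= v.sum()  # remove residual rounding error in the normalization

    result = np.zeros_like(n)
    result[positive] = n_ill * v
    return result

# Hypothetical raw counts for four ill-calibrated outcomes.
print(mwe_counts([50, 30, 20, 0]))
```

The returned counts sum to $N_{\mathrm{ill}}$, so the relation in Eq. (3) is respected, and they can be combined with the well-calibrated counts before the ML reconstruction.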



To put the third strategy to the test, we perform simulations involving noisy measurement
outcomes. We first randomly generate a fixed set of $M = D^2$ rank-one positive operators
$\Pi_l$ such that $\sum_l \Pi_l = 1$, with $D$ being the dimension of the Hilbert space of
interest and $l \in [1, M]$. Next, we select the first $M_1$ of these operators and define them
to be the well-calibrated outcomes $\Pi_j^{(w)} = \mathcal{N}\,\Pi_j$. From the rest of the $M_2$
operators, which we define to be the intended outcomes $\Pi_k = \mathcal{N}\,\Pi_{k+M_1}$, we
construct the ill-calibrated outcomes as

$$\Pi_k^{(i)} = \mathcal{N}\left[(1-\mu)\,\Pi_{k+M_1} + \mu\,\rho_k^{\mathrm{noise}}\right], \tag{6}$$

where $\rho_k^{\mathrm{noise}}$ is a randomly generated full-rank state with respect to the
Hilbert-Schmidt measure and $0 \leq \mu \leq 1$ quantifies the noise level of the POM. The
positive prefactor $\mathcal{N}$ ensures that the outcomes form a valid measurement. To model a
varying noise perturbation, a different $\rho_k^{\mathrm{noise}}$ is assigned to each experiment.
We focus on two-qubit state estimation ($D = 4$) on random two-qubit true quantum states
$\rho_{\mathrm{true}}$ generated with respect to the Hilbert-Schmidt measure. For each
$\rho_{\mathrm{true}}$, 20 experiments are simulated, where $N = 8000$ copies of two-qubits are
measured using the generated POM outcomes $\{\Pi_j^{(w)}; \Pi_k^{(i)}\}$ in each experiment.
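
The construction of the noisy measurement can be sketched in Python as below. Several details are assumptions made for illustration: the random rank-one POM is built by whitening random complex vectors, Hilbert-Schmidt random states are drawn from square complex Gaussian matrices, the prefactor $\mathcal{N}$ multiplies the whole mixture of intended outcome and noise state, and $\mathcal{N}$ is chosen as the inverse of the largest eigenvalue of the summed outcomes. All function names are hypothetical.

```python
import numpy as np

def random_hs_state(dim, rng):
    """Full-rank random state distributed according to the Hilbert-Schmidt measure."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def random_rank_one_pom(dim, rng):
    """M = dim**2 rank-one positive operators that sum to the identity."""
    m = dim**2
    vecs = rng.normal(size=(m, dim)) + 1j * rng.normal(size=(m, dim))
    projs = np.array([np.outer(v, v.conj()) for v in vecs])
    g = projs.sum(axis=0)
    evals, evecs = np.linalg.eigh(g)
    g_inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.conj().T  # G^(-1/2)
    return np.array([g_inv_sqrt @ p @ g_inv_sqrt for p in projs])

def noisy_pom(pom, m1, mu, rng):
    """Return (well-calibrated, intended, ill-calibrated) outcomes for noise level mu."""
    dim = pom.shape[1]
    well = pom[:m1]
    intended = pom[m1:]
    noisy = np.array([(1 - mu) * p + mu * random_hs_state(dim, rng) for p in intended])
    total = well.sum(axis=0) + noisy.sum(axis=0)
    # Assumed choice of the prefactor: rescale so that the outcomes sum to at most 1.
    norm = 1.0 / np.linalg.eigvalsh(total).max()
    return norm * well, norm * intended, norm * noisy

rng = np.random.default_rng(0)
pom = random_rank_one_pom(4, rng)                      # two-qubit case, D = 4
well, intended, noisy = noisy_pom(pom, 0, 0.3, rng)    # M_2 = M, mu = 0.3
```

Calling `noisy_pom` again with the same `pom` draws fresh noise states, which mimics assigning a different noise perturbation to each experiment.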

For analysis, we suppose a typical situation in which $M_2 = M$, so that Strategy 2 becomes
impractical. We first compare the average performances of Strategy 1 and Strategy 3 for 250
random pure states. Figure 1 shows plots of the average trace-class distances between the
estimator obtained with the first or the third strategy and $\hat{\rho}_{\mathrm{ML}}$, which is
the ML estimator obtained with the actual POM measured. It is clear that when $\mu = 0$, the
trace-class distances between $\hat{\rho}_{\mathrm{ML}}^{\mathrm{raw}}$ and
$\hat{\rho}_{\mathrm{ML}}$ will always be zero and Strategy 3 is not needed. For non-zero $\mu$,
provided that $\mu$ is not too large, Strategy 3 turns out to be a better choice for state
estimation than Strategy 1 on average, with percentage improvements that can exceed 50%. For
$\mu > 0.55$, both strategies give very poor tomographic efficiencies since the data are too
unreliable.
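
For concreteness, the figures of merit used in this comparison can be computed as in the Python sketch below. Two conventions are assumed here because the text does not spell them out: the trace-class distance is taken as half the trace norm of the difference, and the percentage improvement is taken relative to the Strategy 1 distance. The function names and example states are hypothetical.

```python
import numpy as np

def trace_class_distance(rho, sigma):
    """(1/2) * tr|rho - sigma|; the factor 1/2 is an assumed normalization."""
    eigenvalues = np.linalg.eigvalsh(rho - sigma)  # the difference is Hermitian
    return 0.5 * float(np.abs(eigenvalues).sum())

def percentage_improvement(d_raw, d_cg):
    """Relative reduction of the distance when going from Strategy 1 to Strategy 3."""
    return 100.0 * (d_raw - d_cg) / d_raw

# Two orthogonal pure qubit states are at the maximal distance of one.
state_0 = np.diag([1.0, 0.0])
state_1 = np.diag([0.0, 1.0])
print(trace_class_distance(state_0, state_1))
print(percentage_improvement(0.2, 0.1))
```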

Next, it is interesting to look at the performance of Strategy 3 with mixed states. For this, we
generate an ensemble of mixed states by taking each random pure state $\rho_{\mathrm{true}}$
that is previously generated and admixing to it the maximally-mixed state according to

$$\rho_{\mathrm{true}} \;\rightarrow\; (1-\gamma)\,\rho_{\mathrm{true}} + \gamma\,\frac{1}{D}, \tag{7}$$

with $\gamma$ quantifying the amount of admixture. Figure 2 shows performance plots for three
different values of admixtures. The drop in average performance of Strategy 3 with increasing
$\gamma$ can be understood from the behavior that maximizing
$\mathcal{H}_{\{\nu_k^{(i)}\}}(\{\nu'_k\})$ as stated in Eq. (4) tends to amplify the
$\nu'_k$'s with large weights and reduce those with small weights (refer to the second article
in [8]). As a result, this strategy can give very small $n'_k$'s, especially when the relative
weight differences are large. Maximizing $\mathcal{L}(\mathcal{D}_{\mathrm{MWE}};\rho)$ then
gives estimators that are nearly rank-deficient. Therefore, Strategy 3 is biased towards
highly-pure estimators, which explains its overall effectiveness on pure true states for low to
moderate $\mu$'s. Without going into the details, we remark that mixed ML estimators of lower
purity can be obtained by using adjustable weight factors $(\nu_k^{(i)})^t$, with $t$ typically
of values smaller than one, to reduce the relative weight differences appropriately. This has
been tested to work in a preliminary investigation on experimental data for two-photon mixed
states [10]. Coarse graining for mixed states and partially calibrated measurements will be
reported in the future.

FIG. 2. Plots of percentage of total random states that respond better to Strategy 3 than
Strategy 1 (Left), average and standard deviation of percentage improvement (Center) and average
trace-class distances with Strategy 1 and Strategy 3 (Right) for different noise levels $\mu$
and mixed quantum states of admixtures $\gamma = 0$ (○), 0.1 (□) and 0.2 (△). Nearly-pure mixed
states yield comparable results. The intersections of the standard deviation curve and the
average curve in (Center) for each $\gamma$ serve as a gauge for the performance range
$\mathcal{R}_\gamma$ of Strategy 3. They are $\mathcal{R}_{0} \approx [0.05, 0.55]$,
$\mathcal{R}_{0.1} \approx [0.05, 0.5]$ and $\mathcal{R}_{0.2} \approx [0.05, 0.45]$.
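
The admixture used to generate the mixed-state ensemble can be sketched in Python as follows, assuming it takes the convex-combination form $(1-\gamma)\rho_{\mathrm{true}} + \gamma/D$. The function names and the example state (a maximally entangled two-qubit state) are hypothetical. The sketch also evaluates the purity $\mathrm{tr}\{\rho^2\}$, which decreases as $\gamma$ grows.

```python
import numpy as np

def admix(rho, gamma):
    """Mix rho with the maximally-mixed state: (1 - gamma) * rho + gamma * 1/D."""
    dim = rho.shape[0]
    return (1.0 - gamma) * rho + gamma * np.eye(dim) / dim

def purity(rho):
    """tr(rho^2); equals one for pure states and 1/D for the maximally-mixed state."""
    return float(np.trace(rho @ rho).real)

# Hypothetical pure two-qubit state (D = 4).
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

for gamma in (0.0, 0.1, 0.2):
    print(gamma, purity(admix(rho_pure, gamma)))
```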

In conclusion, we have established a straightforward coarse-graining numerical method that can
give accurate ML estimators for mixed quantum states that are nearly pure and ill-calibrated
measurement outcomes of low and moderately high noise levels. This method employs the
maximization of the weighted entropy in Eq. (4), which requires no additional information about
the noise or the actual POM and can be applied to very general situations. In order to decide if
coarse-grained MWE is suitable for a given set of experimental data from a sufficiently large
number of copies, one requires confidence that the unknown true state of the source is in the
vicinity of the quantum state of interest, so that a credible comparison of the different
strategies mentioned earlier can be made with respect to this state. Such an expectation is
usually not too demanding as one usually has some prior knowledge about the source being
prepared, which can for instance be obtained through observations of physical aspects of the
set-up. Finally, we would like to briefly mention that two other weighted entropy functions have
been proposed respectively in Refs. [11] and [12]. Experience shows that maximizing the weighted
entropy function in Eq. (4) gives more accurate ML estimators as the other two are less sensitive
to the data and often yield rather flat distributions of parameters.